Separate API calls from install logic #29

Merged · 19 commits merged from feature/release-model into nf-core:master · Jan 12, 2024
Conversation

MillironX
Member

Background

This is phase two of my four-part plan to fix the gripes with this action.

This PR

Separate API calls from install logic

Merry Christmas, nf-core team! Here is a major overhaul to the action code, which will allow us to add mocks, create unit tests, and potentially even substitute caches or external APIs when the GitHub API hits a rate limit. I might want to add a few more tests before merging, but would love feedback now.
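As a rough illustration of the idea (a sketch with made-up names, not this repository's actual code): separating the API calls from the install logic means the install side only depends on an interface, so tests can hand it a static object instead of a live client.

```ts
// A minimal sketch with illustrative names only (not the PR's actual code).
interface NextflowRelease {
  tagName: string
  downloadUrl: string
}

interface ReleaseSource {
  getLatestRelease(): Promise<NextflowRelease>
}

// The install logic only sees the ReleaseSource interface...
async function resolveDownloadUrl(source: ReleaseSource): Promise<string> {
  const release = await source.getLatestRelease()
  return release.downloadUrl
}

// ...so a unit test can substitute a canned object and make zero API calls.
const staticSource: ReleaseSource = {
  getLatestRelease: async () => ({
    tagName: 'v23.10.0',
    downloadUrl: 'https://example.com/nextflow'
  })
}

// resolveDownloadUrl(staticSource) now exercises the install path offline.
```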

MillironX marked this pull request as draft on December 23, 2023 18:02
MillironX marked this pull request as ready for review on January 6, 2024 16:26
@MillironX
Member Author

Hey, @edmundmiller. Everything is ready to go, except for the filename lint rules. I can't find what these rules are, or where they come from. Could you point me in the right direction, and then I can fix that?

@mashehu
Contributor

mashehu commented Jan 9, 2024

> Hey, @edmundmiller. Everything is ready to go, except for the filename lint rules. I can't find what these rules are, or where they come from. Could you point me in the right direction, and then I can fix that?

These linting rules are quite hidden: the culprit was this one: https://github.com/github/eslint-plugin-github/blob/main/lib/configs/recommended.js#L21
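For context, the rule behind those failures appears to be `filenames/match-regex` (enforcing kebab-case file names), pulled in through `plugin:github/recommended`. A hedged sketch of how it could be relaxed in a classic `.eslintrc.js`; the exact pattern shown is an assumption, not the preset's literal value:

```js
// .eslintrc.js — illustrative only; adjust or remove the override as needed.
module.exports = {
  extends: ['plugin:github/recommended'],
  rules: {
    // Allow kebab-case file names with an optional extension-like suffix.
    'filenames/match-regex': ['error', '^[a-z0-9-]+(\\.[a-z0-9-]+)?$']
  }
}
```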

edmundmiller added this to the 2.0.0 milestone on Jan 9, 2024
Collaborator

edmundmiller left a comment


Awesome work!

I'm wondering if there's any way we can "sniff" the number of API requests in the tests? If there's not a quick resolution, let's just make a follow-up issue.

@mashehu
Contributor

mashehu commented Jan 9, 2024

You can read the rate limit in the response header under `x-ratelimit-remaining`

@MatthiasZepper
Member

> You can read the rate limit in the response header under `x-ratelimit-remaining`

Or avoid reinventing the wheel :-)
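Both suggestions amount to something like the sketch below (the owner/repo shown is only an example): read `x-ratelimit-remaining` off a response you are already making, or call the dedicated rate-limit endpoint, which does not count against the quota.

```ts
import { Octokit } from '@octokit/rest'

// Sketch: two ways to check the remaining GitHub API quota.
async function logRateLimit(octokit: Octokit): Promise<void> {
  // Option 1: inspect the headers of a request the action already makes.
  const response = await octokit.rest.repos.getLatestRelease({
    owner: 'nextflow-io',
    repo: 'nextflow'
  })
  console.log('remaining (header):', response.headers['x-ratelimit-remaining'])

  // Option 2: ask the /rate_limit endpoint directly (it is not rate limited).
  const { data } = await octokit.rest.rateLimit.get()
  console.log('remaining (endpoint):', data.resources.core.remaining)
}
```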

@edmundmiller
Collaborator

@MatthiasZepper where have you been hiding that???

But I was actually talking about sniffing it in this code base, to see how many calls we're using, and then doing a back-of-the-napkin calculation.

@MillironX
Member Author

MillironX commented Jan 10, 2024

> I'm wondering if there's any way we can "sniff" the number of API requests in the tests?

Not sure why? 😕 There are exactly ~~three~~ five API calls made by the test suite now: all other "API" tests now use static objects.

@edmundmiller
Collaborator

> Not sure why? 😕 There are exactly ~~three~~ five API calls made by the test suite now: all other "API" tests now use static objects.

To make sure we don't introduce any regressions, because it's been a huge pain during hackathons and for Sarek, and we're about to push it to the limit with the scatter CI in methylseq that I hope to add to the template.

@MatthiasZepper
Member

MatthiasZepper commented Jan 10, 2024

> @MatthiasZepper where have you been hiding that???

In plain sight, obviously. I bragged about it a lot, so often that I already felt bad about it.

Regarding scatter: just mind that billable minutes are calculated at the job level. As far as I know, the nf-core organization doesn't have to pay for the Actions' runtime, but should GitHub ever change its pricing model, we need to be careful, as splitting everything into nano-jobs would have a big impact then.

@MillironX
Member Author

> To make sure we don't introduce any regressions

Regressions in what, exactly? You're worried that the action today uses, say, 5 API calls, and then we might break it by introducing a bug that makes it use, say, 10 API calls?

> because it's been a huge pain during hackathons and for Sarek...

I get that, but that's during unit testing of pipelines, not of this action, right? So in that case, figuring out the timeouts beforehand would happen exactly as we've talked about in #19, and we can now easily add timeout and cooldown abilities to OctokitWrapper.
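For reference, the kind of cooldown hook meant here could look roughly like this; `OctokitWrapper` is the PR's wrapper class, but the method and fields below are hypothetical, not the actual implementation:

```ts
import { Octokit } from '@octokit/rest'

// Hypothetical sketch of a cooldown-aware wrapper method (not the PR's code).
export class OctokitWrapper {
  constructor(private readonly octokit: Octokit) {}

  // Sleep until the rate limit resets when we have run out of calls.
  private async cooldownIfNeeded(headers: {
    [key: string]: string | number | undefined
  }): Promise<void> {
    const remaining = Number(headers['x-ratelimit-remaining'] ?? 1)
    const resetEpochMs = Number(headers['x-ratelimit-reset'] ?? 0) * 1000
    if (remaining === 0 && resetEpochMs > Date.now()) {
      await new Promise(resolve => setTimeout(resolve, resetEpochMs - Date.now()))
    }
  }

  async latestRelease(owner: string, repo: string) {
    const response = await this.octokit.rest.repos.getLatestRelease({ owner, repo })
    await this.cooldownIfNeeded(response.headers)
    return response.data
  }
}
```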

@edmundmiller
Collaborator

> Regarding scatter

Yeah I think we're going to have to cross that bridge if they change the pricing model.

I'm not following, though: you're saying if they charge per job? I think the math would come out pretty close; the tests are taking 1-2 minutes apiece.

The way it's broken up, we could shard them into 4 sets as well, instead of 20 jobs. That could be a better idea anyway because we'll start saturating the free runners quickly.

@edmundmiller
Collaborator

> Regressions in what, exactly? You're worried that the action today uses, say, 5 API calls, and then we might break it by introducing a bug that makes it use, say, 10 API calls?

Exactly!

Anyway, it was just a wish-list item. I made #30 in case we ever get the time/motivation to follow up, so it's good to merge from my side, @MillironX!
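One low-tech way to guard against exactly that regression (a sketch, not necessarily what #30 will end up doing): spy on the API-facing method in a test and assert an upper bound on how often it gets called.

```ts
import { jest, test, expect } from '@jest/globals'

test('install logic stays within its API-call budget', async () => {
  // A spy standing in for the wrapper's API-facing method (illustrative name).
  const getLatestRelease = jest.fn(async () => ({ tag_name: 'v23.10.0' }))
  const source = { getLatestRelease }

  // ...exercise the install logic against `source` here...
  await source.getLatestRelease()

  // If a change starts making more calls than expected, this fails loudly.
  expect(getLatestRelease.mock.calls.length).toBeLessThanOrEqual(5)
})
```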

@MatthiasZepper
Member

MatthiasZepper commented Jan 12, 2024

First of all, sorry @MillironX for hijacking the discussion on your PR. I am neutral and will not submit a review on this PR, since I do not even remotely understand all the considerations that went into it.

> I'm not following, though: you're saying if they charge per job? I think the math would come out pretty close; the tests are taking 1-2 minutes apiece.

I was referring to this section from the HowTo article on Token rotation from Shopify:

> Our original cost estimates included certain calculation assumptions and misconceptions that ended up not holding true in production.
>
> The first assumption was mostly due to a lack of research: the granularity of what GitHub refers to as “billable minutes”. Understanding that the granularity of a billable minute is at, well, the minute-level was one thing; the other thing was the way that the rounding and bucketing works.
>
> Billable minute calculations are not performed at an organization level, but rather at the job level. Each workflow has N underlying jobs, and each underlying job has an execution duration that is rounded to the nearest minute. These rounded nearest minutes are then summed together, even if the jobs were run in parallel, and billed to the organization. Because of this, workflows that ran 10 parallel jobs that would execute in 1 second each would end up accruing 10 billable minutes.
>
> These miscalculations ended up exploding our costs during our prototyping phase and led to an architecture refactoring to sequentially executing all downstream workflows in a single job.
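A back-of-the-napkin check of that rounding effect (assuming, as in the 1-second example above, that each job is rounded up to a whole minute before summing):

```ts
// Sketch: per-job rounding, then summing, as described in the quoted article.
const billableMinutes = (jobDurationsSeconds: number[]): number =>
  jobDurationsSeconds.reduce((sum, seconds) => sum + Math.ceil(seconds / 60), 0)

// 20 parallel jobs at ~90 s each: 20 × 2 min = 40 billable minutes.
console.log(billableMinutes(Array(20).fill(90))) // 40
// The same ~30 min of work in 4 shards of ~450 s each: 4 × 8 min = 32 billable minutes.
console.log(billableMinutes(Array(4).fill(450))) // 32
```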

MillironX merged commit b30f81e into nf-core:master on Jan 12, 2024
16 checks passed
MillironX deleted the feature/release-model branch on January 12, 2024 18:02